Combining Multiple Alignments to Improve Machine Translation

Authors

Zhaopeng Tu

Yang Liu

Yifan He

Josef van Genabith

Qun Liu

Shouxun Lin

Abstract

Word alignment is a critical component of machine translation systems. Various methods for word alignment have been proposed, and different models can produce significantly different outputs. To exploit the advantages of different models, we propose three ways to combine multiple alignments for machine translation: (1) alignment selection, a novel method to select an alignment with the least expected loss from multiple alignments within the minimum Bayes risk framework; (2) alignment refinement, an improved algorithm to refine multiple alignments into a new alignment that favors the consensus of various models; (3) alignment compaction, a compact representation that encodes all alignments generated by different methods (including (1) and (2) above) using a novel calculation of link probabilities. Experiments show that our approach not only improves the alignment quality, but also significantly improves translation performance by up to 1.96 BLEU points over single best alignments, and 1.28 points over merging rules extracted from multiple alignments individually.
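
The alignment selection step in (1) can be illustrated with a minimal sketch. It picks, from several candidate alignments of one sentence pair, the candidate with the least expected loss against all candidates. The abstract does not state the loss function or how candidates are weighted, so this sketch assumes a loss of 1 - F1 over link sets and uniform weights by default; the names `f1_loss` and `mbr_select` are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Optional, Sequence, Tuple

Link = Tuple[int, int]        # (source position, target position)
Alignment = FrozenSet[Link]   # one alignment = a set of links


def f1_loss(hyp: Alignment, ref: Alignment) -> float:
    """Assumed loss: 1 - F1 between two link sets (0.0 means identical)."""
    if not hyp and not ref:
        return 0.0
    overlap = len(hyp & ref)
    return 1.0 - 2.0 * overlap / (len(hyp) + len(ref))


def mbr_select(candidates: Sequence[Alignment],
               weights: Optional[Sequence[float]] = None) -> int:
    """Return the index of the candidate with the least expected loss.

    `weights` plays the role of a probability over candidates; a uniform
    distribution is assumed when it is omitted.
    """
    if not candidates:
        raise ValueError("need at least one candidate alignment")
    if weights is None:
        weights = [1.0 / len(candidates)] * len(candidates)
    if len(weights) != len(candidates):
        raise ValueError("weights and candidates must have the same length")

    def expected_loss(i: int) -> float:
        # Expected loss of candidate i, treating every candidate as a
        # possible reference weighted by its probability.
        return sum(w * f1_loss(candidates[i], other)
                   for w, other in zip(weights, candidates))

    return min(range(len(candidates)), key=expected_loss)


# Three toy alignments of the same sentence pair, e.g. from different aligners.
a = frozenset({(0, 0), (1, 1), (2, 2)})
b = frozenset({(0, 0), (1, 1), (2, 3)})
c = frozenset({(0, 0), (1, 2), (2, 2)})
candidates = [b, a, c]
best = mbr_select(candidates)
print(sorted(candidates[best]))
```

Under these assumptions the selected alignment is the one that agrees most with the others, which is the intuition behind choosing by minimum Bayes risk rather than trusting a single aligner.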

Similar resources

Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages

We present a novel method to improve word alignment quality and eventually the translation performance by producing and combining complementary word alignments for low-resource languages. Instead of focusing on the improvement of a single set of word alignments, we generate multiple sets of diversified alignments based on different motivations, such as linguistic knowledge, morphology and heuri...

Combining many alignments for speech to speech translation

Alignment combination (symmetrization) has been shown to be useful for improving Machine Translation (MT) models. Most existing alignment combination techniques are based on heuristics, and can combine only two sets of alignments at a time. Recently in [1], we proposed a power mean based algorithm that can be optimized to combine an arbitrary number of alignment tables simultaneously. In this pape...

Improving Word Alignment with Bridge Languages

We describe an approach to improve Statistical Machine Translation (SMT) performance using multi-lingual, parallel, sentence-aligned corpora in several bridge languages. Our approach consists of a simple method for utilizing a bridge language to create a word alignment system and a procedure for combining word alignment systems from multiple bridge languages. The final translation is obtained b...

Alignment Symmetrization Optimization Targeting Phrase Pivot Statistical Machine Translation

An important step in mainstream statistical machine translation (SMT) is combining bidirectional alignments into one alignment model. This process is called symmetrization. Most of the symmetrization heuristics and models are focused on direct translation (source-to-target). In this paper, we present symmetrization heuristic relaxation to improve the quality of phrase-pivot SMT (source-[pivot]-t...

Combining Outputs from Multiple Machine Translation Systems

Currently there are several approaches to machine translation (MT) based on different paradigms; e.g., phrasal, hierarchical and syntax-based. These three approaches yield similar translation accuracy despite using fairly different levels of linguistic knowledge. The availability of such a variety of systems has led to a growing interest toward finding better translations by combining outputs f...

Publication date: 2012
